AITopics | bridging planning and reinforcement learning

Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

Neural Information Processing Systems | Dec-25-2025, 10:51:42 GMT

The history of learning for control has been an exciting back and forth between two broad classes of algorithms: planning and reinforcement learning. Planning algorithms effectively reason over long horizons, but assume access to a local policy and distance metric over collision-free paths. Reinforcement learning excels at learning policies and relative values of states, but fails to plan over long horizons. Despite the successes of each method on various tasks, long-horizon, sparse-reward tasks with high-dimensional observations remain exceedingly challenging for both planning and reinforcement learning algorithms. Frustratingly, these sorts of tasks are potentially the most useful, as they are simple to design (a human need only provide an example goal state) and avoid injecting bias through reward shaping.

algorithm, bridging planning and reinforcement learning, replay buffer, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reviews: Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

Neural Information Processing Systems | Jan-24-2025, 00:30:39 GMT

Compact search spaces would confer computational benefits if nothing else. Overall, studying how compact representations of the state might compare when used inside graph search seems like a nice way to evaluate just how much utility is added by the distributional RL component of the overall approach.

bridging planning and reinforcement learning, evaluation, replay buffer, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.40)

Reviews: Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

Neural Information Processing Systems | Jan-24-2025, 00:30:28 GMT

The paper presents a general-purpose control algorithm combining planning and RL to solve tasks with sparse rewards or long horizons. This algorithm is novel and interesting. The three reviewers agree that the contributions presented here should be published at the conference. The rebuttal helped resolve most of the clarification issues. The reviewers also suggest various ways to further improve the manuscript.

algorithm, bridging planning and reinforcement learning, replay buffer

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

Eysenbach, Ben, Salakhutdinov, Russ R., Levine, Sergey

Neural Information Processing Systems | Mar-19-2020, 03:02:04 GMT

The history of learning for control has been an exciting back and forth between two broad classes of algorithms: planning and reinforcement learning. Planning algorithms effectively reason over long horizons, but assume access to a local policy and distance metric over collision-free paths. Reinforcement learning excels at learning policies and relative values of states, but fails to plan over long horizons. Despite the successes of each method on various tasks, long-horizon, sparse-reward tasks with high-dimensional observations remain exceedingly challenging for both planning and reinforcement learning algorithms. Frustratingly, these sorts of tasks are potentially the most useful, as they are simple to design (a human need only provide an example goal state) and avoid injecting bias through reward shaping.

algorithm, bridging planning and reinforcement learning, replay buffer, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
